74 research outputs found

Making Entailment Set Changes Explicit Improves the Understanding of Consequences of Ontology Authoring Actions

Author: H Knublauch
H Wang
M Horridge
M Lee
M Vigo
P Lambrix
T Tudorache
TGO Consortium
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Crossref

The University of Manchester - Institutional Repository

Computation of significance scores of unweighted Gene Set Enrichment Analyses

Author: A Subramanian
A Zanzoni
Andreas Keller
C Backes
Christina Backes
E Rubin
H Hermjakob
H Lee
Hans-Peter Lenhof
J Küntzer
J Lamb
L Salwinski
M Kanehisa
M Krull
S Kim
S Peri
S Wachi
T Barrett
TGO Consortium
V Matys
V Mootha
Y Benjamini
Y Hochberg
Z Jiang
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: Gene Set Enrichment Analysis (GSEA) is a computational method for the statistical evaluation of sorted lists of genes or proteins. GSEA was originally developed for interpreting microarray gene expression data, but it can be applied to any sorted list of genes. Given the gene list and an arbitrary biological category, GSEA evaluates whether the genes of the considered category are randomly distributed or accumulated at the top or bottom of the list. Usually, significance scores (p-values) of GSEA are computed by nonparametric permutation tests, a time-consuming procedure that yields only estimates of the p-values. RESULTS: We present a novel dynamic programming algorithm for calculating exact significance values of unweighted Gene Set Enrichment Analyses. Our algorithm avoids typical problems of nonparametric permutation tests, such as findings that vary between runs because of the random sampling procedure. Another advantage of the presented dynamic programming algorithm is its runtime and memory efficiency. To test our algorithm, we applied it not only to simulated data sets but also to expression profiles of squamous cell lung cancer tissue and autologous unaffected tissue.
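To make the idea in the abstract above concrete, here is a minimal sketch of how an exact p-value for an unweighted, two-sided running-sum statistic can be obtained by dynamic programming over lattice paths. It is an illustration under stated assumptions, not the authors' published implementation: the scoring scheme (add m - l for a category gene, subtract l otherwise), the two-sided maximum-deviation statistic, and all function names are assumptions chosen for this example.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def max_deviation(in_category):
    """Largest absolute running-sum value for one ordering (True = gene in category)."""
    m = len(in_category)
    l = sum(in_category)
    running, best = 0, 0
    for hit in in_category:
        # Assumed unweighted scoring: the steps cancel out over the full list.
        running += (m - l) if hit else -l
        best = max(best, abs(running))
    return best


def exact_p_value(in_category):
    """Exact two-sided p-value via dynamic programming (hypothetical helper)."""
    m = len(in_category)
    l = sum(in_category)
    observed = max_deviation(in_category)
    if observed == 0:  # empty or all-inclusive category: every ordering ties
        return Fraction(1)
    # paths[i][j]: number of orderings of i category genes and j other genes
    # whose running sum stays strictly below `observed` in absolute value.
    paths = [[0] * (m - l + 1) for _ in range(l + 1)]
    for i in range(l + 1):
        for j in range(m - l + 1):
            if abs(i * (m - l) - j * l) >= observed:
                continue  # this prefix already reaches the observed score
            if i == 0 and j == 0:
                paths[i][j] = 1
            else:
                paths[i][j] = (paths[i - 1][j] if i else 0) + (paths[i][j - 1] if j else 0)
    # Orderings that never reach the observed score are the complement event.
    return 1 - Fraction(paths[l][m - l], comb(m, l))


def brute_force_p_value(in_category):
    """Reference check: enumerate every placement of the category genes."""
    m, l = len(in_category), sum(in_category)
    observed = max_deviation(in_category)
    hits = 0
    for positions in combinations(range(m), l):
        chosen = set(positions)
        if max_deviation([k in chosen for k in range(m)]) >= observed:
            hits += 1
    return Fraction(hits, comb(m, l))
```

The table has (l + 1) × (m - l + 1) cells, so the count is exact and repeatable, whereas a permutation test only estimates the same fraction by random sampling. The brute-force function is included purely as a cross-check for small lists.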

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

A graph-search framework for associating gene identifiers with documents

Author: A Yeh
AM Cohen
C Zhai
Consortium TGO
D Hanisch
E Hatcher
E Minkov
Einat Minkov
F Sha
J Crim
K Franzén
K Fundel
K Humphreys
L Hirschman
M Collins
M Craven
R Bunescu
RI Kondor
T Rindflesch
U Leser
William W Cohen
WW Cohen
Y Altun
Y Freund
Z Kou
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: One step in the model organism database curation process is to find, for each article, the identifier of every gene discussed in the article. We consider a relaxation of this problem suitable for semi-automated systems, in which each article is associated with a ranked list of possible gene identifiers, and experimentally compare methods for solving this geneId ranking problem. In addition to baseline approaches based on combining named entity recognition (NER) systems with a "soft dictionary" of gene synonyms, we evaluate a graph-based method which combines the outputs of multiple NER systems, as well as other sources of information, and a learning method for reranking the output of the graph-based method. RESULTS: We show that named entity recognition (NER) systems with similar F-measure performance can have significantly different performance when used with a soft dictionary for geneId-ranking. The graph-based approach can outperform any of its component NER systems, even without learning, and learning can further improve the performance of the graph-based ranking approach. CONCLUSION: The utility of a named entity recognition (NER) system for geneId-finding may not be accurately predicted by its entity-level F1 performance, the most common performance measure. GeneId-ranking systems are best implemented by combining several NER systems. With appropriate combination methods, usefully accurate geneId-ranking systems can be constructed based on easily available resources, without resorting to problem-specific, engineered components.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Webulous and the Webulous Google Add-On - a web service and application for ontology building from templates

Author: A Gangemi
B Smith
D Welter
E Maguire
H Dietze
J Malone
K Wolstencroft
M Horridge
ME Aranguren
N Kolesnikov
NF Noy
S Jupp
TGO Consortium
Z Xiang
Publication venue: 'Springer Science and Business Media LLC'
Publication date

Crossref

Classification of microarray data using gene networks

Author: A Sivachenko
Andrei Zinovyev
B Mohar
B Schölkopf
BE Boser
D Cavalieri
D Hanisch
D Hosack
Emmanuel Barillot
Franck Rapaport
FRK Chung
G Mercier
I Gat-Viks
I Jolliffe
J Rahnenfuhrer
J van Helden
JC Liao
Jean-Philippe Vert
JM Stuart
JP Vert
KR Curtis
Marie Dutreix
O Babur
O Radulescu
P Kharchenko
P Shannon
PD Karp
R Kelley
R Thomas
SJ Galbraith
T Breslin
T Hastie
TGO Consortium
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: Microarrays have become extremely useful for analysing genetic phenomena, but establishing a relation between microarray analysis results (typically a list of genes) and their biological significance is often difficult. Currently, the standard approach is to map a posteriori the results onto gene networks in order to elucidate the functions perturbed at the level of pathways. However, integrating a priori knowledge of the gene networks could help in the statistical analysis of gene expression data and in their biological interpretation. RESULTS: We propose a method to integrate a priori the knowledge of a gene network in the analysis of gene expression data. The approach is based on the spectral decomposition of gene expression profiles with respect to the eigenfunctions of the graph, resulting in an attenuation of the high-frequency components of the expression profiles with respect to the topology of the graph. We show how to derive unsupervised and supervised classification algorithms of expression profiles, resulting in classifiers with biological relevance. We illustrate the method with the analysis of a set of expression profiles from irradiated and non-irradiated yeast strains. CONCLUSION: Including a priori knowledge of a gene network for the analysis of gene expression data leads to good classification performance and improved interpretability of the results.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

HAL-MINES ParisTech

Comparative GO: a web application for comparative Gene Ontology and Gene Ontology-based gene selection in bacteria

Author: Abiodun D. Ogunniyi
AD Ogunniyi
CI Castillo-Davis
D Bogaert
D Martin
David L. Adelson
E Camon
EI Boyle
Esmaeil Ebrahimie
F Al-Shahrour
F Wilcoxon
G Dennis
GO Consortium
HB Mann
James C. Paton
KL O'Brien
Layla K. Mahdi
LK Mahdi
M Ashburner
MA Harris
MA Stephens
Mario Fruzangohar
Randen Lee Patterson
T Beissbarth
TGO Consortium
W Huang da
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2013

Extent: 8p. The primary means of classifying new functions for genes and proteins relies on Gene Ontology (GO), which defines genes/proteins using a controlled vocabulary in terms of their Molecular Function, Biological Process and Cellular Component. The challenge is to present this information so that researchers can compare and discover patterns in multiple datasets using visually comprehensible and user-friendly statistical reports. Importantly, while there are many GO resources available for eukaryotes, none is suitable for simultaneous graphical and statistical comparison between multiple datasets. In addition, none of them provides comprehensive resources for bacteria. Using Streptococcus pneumoniae as a model, we identified and collected GO resources including genes, proteins, taxonomy and GO relationships from NCBI, UniProt and GO organisations. We then designed database tables in a PostgreSQL database server and developed a Java application to extract data from source files and load it into the database automatically. We developed a PHP web application based on a Model-View-Control architecture and used a specific data structure, as well as current and novel algorithms, to estimate GO graph parameters. We designed different navigation and visualization methods on the graphs and integrated these into graphical reports. This tool is particularly significant when comparing GO groups between multiple samples (including those of pathogenic bacteria) from different sources simultaneously. Comparing GO protein distribution among up- or down-regulated genes from different samples can improve understanding of biological pathways and mechanism(s) of infection. It can also aid in the discovery of genes associated with specific function(s) for investigation as novel vaccine or therapeutic targets. Mario Fruzangohar, Esmaeil Ebrahimie, Abiodun D. Ogunniyi, Layla K. Mahdi, James C. Paton, David L. Adelso

Public Library of Science (PLOS)

Crossref

Adelaide Research & Scholarship

Directory of Open Access Journals

PubMed Central

University of Southern Queensland ePrints

University of Melbourne Institutional Repository

Identifying dysfunctional crosstalk of pathways in various regions of Alzheimer's disease brains

Author: A Grigoriev
A Subramanian
A Zanzoni
C Alfarano
C Ballatore
C Lanni
C Stark
D Huang
D Selkoe
E Edelman
E Segal
E Tobinick
H Chuang
H Ge
H Hermjakob
H Pang
I Ulitsky
J Hardy
K Cheung
L Chen
Luonan Chen
M Francesconi
M Goedert
M Kanehisa
M Liu
M Wolfe
N Bhardwaj
P Dash
P Jonsson
R Fisher
R Jansen
R Majeti
S Peri
S Wachi
T Ideker
TGO Consortium
V Bolos
V Limviphuvadh
V Tusher
W Liang
X Zhao
Xiang-Sun Zhang
Y Li
Yong Wang
Z Guo
Zhi-Ping Liu
Publication venue: BioMed Central
Publication date: 01/01/2010

Crossref

Springer - Publisher Connector

PubMed Central

Simple Shared Motifs (SSM) in conserved region of promoters: a new approach to identify co-regulation patterns

Author: A Atfi
A Coppe
A Fadda
A Sandelin
A Subramanian
AB Georges
C Dieterich
C Huttenhower
D Boffelli
D Cora
DA Tagle
DJ Reiss
E Davidson
E Eskin
E Wingender
G Kreiman
G Robertson
G Thijs
GL Hager
GZ Hertz
H Le Pabic
HK Lee
JD Thompson
JM Vaquerizas
JS Michaloski
Jérémy Gruel
K Quandt
L Marino-Ramirez
M Blanchette
M Endoh
M Kanehisa
M Kazemian
M Rebeiz
M Tompa
MC Frith
Michel LeBorgne
MM Babu
Nathalie Théret
Nolwenn LeMeur
O Hallikas
Q Zhou
RW Hamming
S Falcon
S Hannenhalli
T Knittel
TA Down
TGO Consortium
TL Bailey
VK Mootha
W Thompson
Y Halperin
YH Grad
Publication venue: BioMed Central
Publication date: 01/01/2011

BACKGROUND: Regulation of gene expression plays a pivotal role in cellular functions. However, understanding the dynamics of transcription remains a challenging task. A host of computational approaches have been developed to identify regulatory motifs, mainly based on the recognition of DNA sequences for transcription factor binding sites. Recent integration of additional data from genomic analyses or phylogenetic footprinting has significantly improved these methods. RESULTS: Here, we propose a different approach based on the compilation of Simple Shared Motifs (SSM), groups of sequences defined by their length and similarity and present in conserved sequences of gene promoters. We developed an original algorithm to search and count SSM in pairs of genes. An exceptional number of SSM is considered a common regulatory pattern. The SSM approach is applied to a sample set of genes and validated using functional gene-set enrichment analyses. We demonstrate that the SSM approach selects genes that are over-represented in specific biological categories (Ontology and Pathways) and are enriched in co-expressed genes. Finally, we show that genes co-expressed in the same tissue or involved in the same biological pathway have increased SSM values. CONCLUSIONS: Using unbiased clustering of genes, Simple Shared Motifs analysis constitutes an original contribution to provide a clearer definition of expression networks.

HAL-CentraleSupelec

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

INRIA a CCSD electronic archive server

PubMed Central

HAL-Rennes 1

Geometric De-noising of Protein-Protein Interaction Networks

Author: A Kumar
A Labarga
AC Gavin
AL Barabasi
AM Edwards
C Bishop
C Stark
C von Mering
D Higham
Desmond J. Higham
DS Han
F Abraham
G Bader
G Hart
G Mishra
GH Golub
H Chua
J Chen
J Rual
J Wang
J Yu
L Giot
M Kanehisa
M Penrose
Marija Rašajski
MS Lee
N Krogan
N Pržulj
Nataša Pržulj
O Kuchaiev
Oleksii Kuchaiev
P Erdös
P Uetz
R Colak
R Jansen
R Singh
S Collins
S Li
S Pitre
S Suthram
T Cox
T Ito
T Milenkovic
Teresa Maria Przytycka
TGO Consortium
U Stelzl
XW Chen
Y Ho
Z Ma
Publication venue: Public Library of Science
Publication date: 01/08/2009

Understanding complex networks of protein-protein interactions (PPIs) is one of the foremost challenges of the post-genomic era. Due to the recent advances in experimental bio-technology, including yeast-2-hybrid (Y2H), tandem affinity purification (TAP) and other high-throughput methods for protein-protein interaction (PPI) detection, huge amounts of PPI network data are becoming available. Of major concern, however, are the levels of noise and incompleteness. For example, for Y2H screens, it is thought that the false positive rate could be as high as 64%, and the false negative rate may range from 43% to 71%. TAP experiments are believed to have comparable levels of noise.

Crossref

University of Strathclyde Institutional Repository

Directory of Open Access Journals

PubMed Central

UCL Discovery

eScholarship - University of California

Snazer: the simulations and networks analyzer

Author: A Neumann
A Raj
A Romanel
A Seary
AA Cuellar
B Breitkreutz
B Novak
C Cannataro
C Priami
C Schaefer
Consortium TGO
Corrado Priami
D Auber
D Krackhardt
E De Silva
E Dougherty
F Baudi
F Iragne
G Pavlopoulos
Gennaro Iaccarino
H Hermjakob
I Ben-Gal
J Kim
J Peterson
J Tyson
J Zámborszky
JE Hopcroft
K Claffy
K Van Gend
L Dematté
M Baitaluk
M Cannataro
M Hucka
M Sabouri-Ghomi
P Ballarini
P Hartman
P Shannon
R Douglas
R Lamprecht
R Milner
S Hooper
S Kauffman
S Sedwards
SA Kauffman
T Freeman
T Fruchterman
Tommaso Mazza
U Brandes
U Erra
VL Katanaev
Publication venue: BioMed Central
Publication date: 01/01/2010

BACKGROUND: Networks are widely recognized as key determinants of structure and function in systems that span the biological, physical, and social sciences. They are static pictures of the interactions among the components of complex systems. Often, much effort is required to identify networks as part of particular patterns as well as to visualize and interpret them. From a purely dynamical perspective, simulation represents a relevant way out. Many simulator tools have capitalized on the "noisy" behavior of some systems and used formal models to represent cellular activities as temporal trajectories. Statistical methods have been applied to a fairly large number of replicated trajectories in order to infer knowledge. A tool that both graphically manipulates reactive models and handles sets of simulation time-course data through aggregation, interpretation and statistical analysis is missing and could add value to simulators. RESULTS: We designed and implemented Snazer, the simulations and networks analyzer. Its goal is to aid the processes of visualizing and manipulating reactive models, as well as to share and interpret time-course data produced by stochastic simulators or by any other means. CONCLUSIONS: Snazer is a solid prototype that integrates biological network and simulation time-course data analysis techniques.

Crossref

Directory of Open Access Journals

PubMed Central

Archivio della Ricerca - Università di Pisa